 structure determination


XDXD: End-to-end crystal structure determination with low resolution X-ray diffraction

Zhao, Jiale, Liu, Cong, Zhang, Yuxuan, Gong, Chengyue, Zhang, Zhenyi, Jin, Shifeng, Liu, Zhenyu

arXiv.org Artificial Intelligence

Determining crystal structures from X-ray diffraction data is fundamental across diverse scientific fields, yet remains a significant challenge when data is limited to low resolution. While recent deep learning models have made breakthroughs in solving the crystallographic phase problem, the resulting low-resolution electron density maps are often ambiguous and difficult to interpret. To overcome this critical bottleneck, we introduce XDXD, to our knowledge, the first end-to-end deep learning framework to determine a complete atomic model directly from low-resolution single-crystal X-ray diffraction data. Our diffusion-based generative model bypasses the need for manual map interpretation, producing chemically plausible crystal structures conditioned on the diffraction pattern. We demonstrate that XDXD achieves a 70.4% match rate for structures with data limited to 2.0 Å resolution, with a root-mean-square error (RMSE) below 0.05. Evaluated on a benchmark of 24,000 experimental structures, our model proves to be robust and accurate. Furthermore, a case study on small peptides highlights the model's potential for extension to more complex systems, paving the way for automated structure solution in previously intractable cases.



Adaptive Multimodal Protein Plug-and-Play with Diffusion-Based Priors

Banerjee, Amartya, Xu, Xingyu, Moosmüller, Caroline, Lee, Harlin

arXiv.org Artificial Intelligence

In an inverse problem, the goal is to recover an unknown parameter (e.g., an image) that has typically undergone some lossy or noisy transformation during measurement. Recently, deep generative models, particularly diffusion models, have emerged as powerful priors for protein structure generation. However, integrating noisy experimental data from multiple sources to guide these models remains a significant challenge. Existing methods often require precise knowledge of experimental noise levels and manually tuned weights for each data modality. In this work, we introduce Adam-PnP, a Plug-and-Play framework that guides a pre-trained protein diffusion model using gradients from multiple, heterogeneous experimental sources. Our framework features an adaptive noise estimation scheme and a dynamic modality weighting mechanism integrated into the diffusion process, which reduce the need for manual hyperparameter tuning. Experiments on complex reconstruction tasks demonstrate significantly improved accuracy using Adam-PnP.


Solving Inverse Problems in Protein Space Using Diffusion-Based Priors

Levy, Axel, Chan, Eric R., Fridovich-Keil, Sara, Poitevin, Frédéric, Zhong, Ellen D., Wetzstein, Gordon

arXiv.org Artificial Intelligence

The interaction of a protein with its environment can be understood and controlled via its 3D structure. Experimental methods for protein structure determination, such as X-ray crystallography or cryogenic electron microscopy, shed light on biological processes but introduce challenging inverse problems. Learning-based approaches have emerged as accurate and efficient methods to solve these inverse problems for 3D structure determination, but are specialized for a predefined type of measurement. Here, we introduce a versatile framework to turn raw biophysical measurements of varying types into 3D atomic models. Our method combines a physics-based forward model of the measurement process with a pretrained generative model providing a task-agnostic, data-driven prior. Our method outperforms posterior sampling baselines on both linear and non-linear inverse problems. In particular, it is the first diffusion-based method for refining atomic models from cryo-EM density maps.


Deep learning for reconstructing protein structures from cryo-EM density maps: recent advances and future directions

Giri, Nabin, Roy, Raj S., Cheng, Jianlin

arXiv.org Artificial Intelligence

Deep learning is a promising technique for efficient, automatic, and accurate reconstruction of protein structures from cryo-EM density maps. Advanced convolutional neural networks and U-Nets have been successfully applied to reconstruct protein structures from high-resolution cryo-EM density maps. Creating high-quality cryo-EM data sets for training and testing deep learning methods is important, and there is a significant need to curate such data sets to facilitate the development of deep learning methods. Better structure reconstruction can be obtained by combining AlphaFold-predicted structure models with cryo-EM data and by integrating cryo-EM-based structure determination techniques with protein structure prediction techniques.


Whither structural biologists?

#artificialintelligence

Between December 2020 and July 2021, several spectacular developments in the field of protein-structure prediction changed structural biology profoundly, and they are expected to have an impact on much of modern (molecular) biology, medicine, biochemistry and biotechnology. The unprecedented accuracy of blind protein-structure predictions produced by DeepMind's AlphaFold2 was revealed at the CASP 14 meeting in December 2020. In July 2021, this was followed by publication of the method and release of the code (Jumper et al., 2021). Simultaneously, a prediction method from the Baker lab that achieved similar accuracy was published (Baek et al., 2021). A week later, an additional publication described proteome-scale application of protein-structure prediction using AlphaFold2.


Sequence-guided protein structure determination using graph convolutional and recurrent networks

Li, Po-Nan, de Oliveira, Saulo H. P., Wakatsuki, Soichi, Bedem, Henry van den

arXiv.org Machine Learning

Single-particle cryogenic electron microscopy (cryo-EM) experiments now routinely produce high-resolution data for large proteins and their complexes. Building an atomic model into a cryo-EM density map is challenging, particularly when no structure for the target protein is known a priori. Existing protocols for this type of task often rely on significant human intervention and can take hours to many days to produce an output. Here, we present a fully automated, template-free model building approach that is based entirely on neural networks. We use a graph convolutional network (GCN) to generate an embedding from a set of rotamer-based amino acid identities and candidate 3-dimensional Cα locations. Starting from this embedding, we use a bidirectional long short-term memory (LSTM) module to order and label the candidate identities and atomic locations consistent with the input protein sequence to obtain a structural model. Our approach paves the way for determining protein structures from cryo-EM densities in a fraction of the time of existing approaches and without the need for human intervention.


A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes

Xu, Kui, Wang, Zhe, Shi, Jiangping, Li, Hongsheng, Zhang, Qiangfeng Cliff

arXiv.org Machine Learning

Constructing molecular structural models from Cryo-Electron Microscopy (Cryo-EM) density volumes is the critical last step of structure determination by Cryo-EM technologies. Methods have evolved from manual construction by structural biologists to 6D translation-rotation searching, which is extremely compute-intensive. In this paper, we propose a learning-based method and formulate this problem as a vision-inspired 3D detection and pose estimation task. We develop a deep learning framework for amino acid determination in a 3D Cryo-EM density volume. We also design a sequence-guided Monte Carlo Tree Search (MCTS) to thread over the candidate amino acids to form the molecular structure. This framework achieves 91% coverage on our newly proposed dataset and takes only a few minutes for a typical structure with a thousand amino acids. Our method is hundreds of times faster and several times more accurate than existing automated solutions without any human intervention.


New machine-learning algorithms may revolutionize drug discovery -- and our understanding of life

#artificialintelligence

Image caption: A new set of machine-learning algorithms can generate 3D structures of complex nanoscale protein molecules, such as a proteasome map refined to 2.8 Angstroms (0.28 nanometer) in 70 min with 49,954 particle images (credit: Structura Biotechnology Inc.)

A new set of machine-learning algorithms developed by researchers at the University of Toronto Scarborough can generate 3D structures of nanoscale protein molecules that could not be achieved in the past. The algorithms may revolutionize the development of new drug therapies for a range of diseases and may even lead to a better understanding of how life works at the atomic level, the researchers say. Drugs work by binding to a specific protein molecule and changing the protein's 3D shape, which alters the way the protein works once inside the body. The ideal drug is designed in a shape that will only bind to a specific protein or group of proteins that are involved in a disease, while eliminating side effects that occur when drugs bind to other proteins in the body.